Character Encodings and Their Internet Names
Table C-1 lists character encodings for various languages, gives some of their common Internet names, and identifies the version of the Text Encoding Conversion Manager in which each character encoding was first supported for use by the Text Encoding Converter and the Unicode Converter. In the last two columns of the table, “N/A” means that the encoding is not supported.
Table C-1 Character encoding Internet names and availability in Mac OS
| Character encoding | Common Internet names | Related information | First Text Encoding Conversion Manager version with support in Text Encoding Converter | First Text Encoding Conversion Manager version with support in Unicode Converter |
|---|---|---|---|---|
| Universal | | | | |
| Unicode 2.0 (16-bit) | UTF-16 | | 1.2 | 1.2 |
| Unicode 2.0 UTF-8 | UTF-8 | | 1.2 | 1.2.1 |
| Unicode 2.0 UTF-7 | UTF-7 | | 1.2 | N/A |
| Unicode 1.1 (16-bit) | UNICODE-1-1 | | 1.2 | 1.2 |
| Unicode 1.1 UTF-8 | UNICODE-1-1-UTF-8 | | 1.2 | 1.2.1 |
| Unicode 1.1 UTF-7 | UNICODE-1-1-UTF-7 | | 1.2 | N/A |
| Western European languages | | | | |
| ASCII | US-ASCII | | 1.2.1 | 1.2.1 |
| ISO 8859-1 (Latin-1) | ISO-8859-1, latin1 | | 1.2.1 | 1.2.1 |
| CP 1252 (Windows Latin-1) | windows-1252, cp1252 | ISO 8859-1, plus additions in C1 area | 1.2 | 1.2 |
| CP 437 (DOS Latin-US) | cp437 | | 1.2 | 1.2 |
| CP 850 (DOS Latin-1) | cp850 | | 1.4 | 1.4 |
| Mac OS Roman | mac, macintosh, x-mac-roman | | 1.2 | 1.2 |
| Mac OS Icelandic | x-mac-icelandic | Based on Mac OS Roman | 1.2 | 1.2 |
| Mac OS Latin-1, Mac OS Mail | x-mac-latin1 (commonly sent as ISO-8859-1) | Mac OS Roman permuted to align with 8859-1 | 1.2 | 1.2 |
| NextStep Latin | | | 1.2 | 1.2 |
| CP 037 (EBCDIC-US) | cp037 | ISO 8859-1 repertoire, different layout | 1.2.1 | 1.2.1 |
| Arabic | | | | |
| ISO 8859-6 (Latin/Arabic) | ISO-8859-6, arabic | | 1.2 | 1.2 |
| CP 1256 (Windows Arabic) | windows-1256, cp1256 | Partly 8859-6, plus C1 additions | 1.2 | 1.2 |
| CP 864 (DOS Arabic) | cp864 | Encodes Arabic presentation forms | 1.2 | 1.2 |
| Mac OS Arabic | x-mac-arabic | | 1.2 | 1.2 |
| Mac OS Farsi | x-mac-farsi | | 1.2 | 1.2 |
| Central European languages | | | | |
| ISO 8859-2 (Latin-2) | ISO-8859-2, latin2 | | 1.2 | 1.2 |
| CP 1250 (Windows Latin-2) | windows-1250, cp1250 | Partly 8859-2, plus C1 additions | 1.2 | 1.2 |
| Mac OS Central European Roman | x-mac-centraleurroman | | 1.2 | 1.2 |
| Mac OS Croatian | x-mac-croatian | Based on Mac OS Roman | 1.2 | 1.2 |
| Mac OS Romanian | x-mac-romanian | Based on Mac OS Roman | 1.2 | 1.2 |
| Chinese | | | | |
| GB 2312-80 | | | 1.2 | N/A |
| EUC-CN | GB2312, X-EUC-CN | ASCII + GB 2312-80 (8-bit) | 1.2 | 1.2 |
| CP 936 (DOS and Windows Simplified) | | Similar to GBK | 1.4 | 1.4 |
| Mac OS Chinese Simplified | | Based on EUC-CN | 1.2 | 1.2 |
| ISO 2022-CN ("GB") | ISO-2022-CN | ASCII + GB 2312-80 (7-bit) (see RFC 1922) | 1.2 | N/A |
| HZ | HZ-GB-2312 | ASCII + GB 2312-80 (7-bit) (see RFC 1842) | 1.2 | N/A |
| GBK (extended GB) | | EUC-CN + Unihan repertoire (8-bit) | 1.2 | 1.2 |
| CNS 11643 plane 1 | x-cns11643-1 | | N/A | N/A |
| CNS 11643 plane 2 | x-cns11643-2 | | N/A | N/A |
| EUC-TW | X-EUC-TW | ASCII + CNS 11643-1992 (8-bit) | 1.2 | 1.2 |
| Big-5 | Big5 | (8-bit) | 1.2 | 1.2 |
| CP 950 (DOS and Windows Traditional) | | Based on Big-5 | 1.4 | 1.4 |
| Mac OS Chinese Traditional | | Based on Big-5 | 1.2 | 1.2 |
| CCCII | | | N/A | N/A |
| EACC | | | N/A | N/A |
| Cyrillic | | | | |
| ISO 8859-5 (Latin/Cyrillic) | ISO-8859-5, cyrillic | | 1.2 | 1.2 |
| KOI8-R | KOI8-R | See RFC 1489 | 1.2 | 1.2 |
| CP 1251 (Windows Cyrillic) | windows-1251, cp1251 | Not based on ISO 8859-5 | 1.2 | 1.2 |
| CP 866 (DOS Russian) | cp866 | | N/A | N/A |
| Mac OS Cyrillic | x-mac-cyrillic | | 1.2 | 1.2 |
| Mac OS Ukrainian | x-mac-ukrainian | Mac OS Cyrillic with two replacements | 1.2 | 1.2 |
| Greek | | | | |
| ISO 8859-7 | ISO-8859-7, greek | | 1.2 | 1.2 |
| ISO 5428 | ISO_5428:1980 | | N/A | N/A |
| CP 1253 (Windows Greek) | windows-1253, cp1253 | Nearly 8859-7, plus C1 additions | 1.2 | 1.2 |
| Mac OS Greek | x-mac-greek | | 1.2 | 1.2 |
| Greek CCITT | greek-ccitt | | N/A | N/A |
| Hebrew | | | | |
| ISO 8859-8 (Latin/Hebrew) | ISO-8859-8, hebrew | | 1.2 | 1.2 |
| CP 1255 (Windows Hebrew) | windows-1255, cp1255 | Mostly 8859-8, plus C1 additions | 1.2 | 1.2 |
| Mac OS Hebrew (2 variants) | x-mac-hebrew | | 1.2 | 1.2 |
| Indic | | | | |
| ISCII-91 | | Parallel encodings for all Indic scripts | N/A | N/A |
| Mac OS Gujarati | | | 1.2 | 1.2 |
| Mac OS Devanagari | | | 1.2 | 1.2 |
| Mac OS Gurmukhi | | | 1.2 | 1.2 |
| Japanese | | | | |
| JIS X0208 | | | 1.2 | N/A |
| JIS X0212 | | | N/A | N/A |
| EUC-JP | EUC-JP, X-EUC-JP | JIS 201 + JIS 208 + JIS 212 (8-bit) | 1.2 | 1.4 |
| ISO 2022-JP ("JIS") | ISO-2022-JP | JIS 201 + JIS 208 + JIS 212 (7-bit); RFC 1468 | 1.2 | N/A |
| Shift-JIS | Shift_JIS, x-sjis, x-shift-jis | JIS 201 + JIS 208 (8-bit) | 1.2 | 1.2 |
| CP 932 (DOS + Windows) | | Based on Shift-JIS | 1.4 | 1.4 |
| Mac OS Japanese | | Based on Shift-JIS | 1.2 | 1.2 |
| Korean | | | | |
| KSC 5601-1987 | | | 1.2 | N/A |
| EUC-KR | EUC-KR | ASCII + KSC 5601-87 (8-bit); RFC 1557 | 1.2 | 1.2 |
| CP 949 (DOS + Windows) | | Unified Hangul Code: EUC-KR + Johab | N/A | N/A |
| Mac OS Korean | | Based on EUC-KR | 1.2 | 1.2 |
| ISO 2022-KR ("KSC") | ISO-2022-KR | ASCII + KSC 5601-87 (7-bit); RFC 1557 | 1.2 | N/A |
| KSC 5700 | | | N/A | N/A |
| Symbols encoding | | | | |
| Adobe Symbol | Adobe-Symbol-Encoding | | N/A | N/A |
| Mac OS Symbol | x-mac-symbol | Based on Adobe Symbol | 1.2 | 1.2 |
| Mac OS Dingbats | x-mac-dingbats | Based on Adobe Zapf Dingbats | 1.2 | 1.2 |
| Thai | | | | |
| TIS 620-2533 | | | N/A | N/A |
| CP 874 (DOS + Windows) | cp874 | Based on TIS 620-2533 | 1.4 | 1.4 |
| Mac OS Thai | x-mac-thai | Based on TIS 620-2533 | 1.2 | 1.2 |
| Turkish | | | | |
| ISO 8859-9 (Latin-5) | ISO-8859-9, latin5 | | 1.2 | 1.2 |
| ISO 8859-3 (Latin-3) | ISO-8859-3 | | N/A | N/A |
| CP 1254 (Windows Latin-5) | windows-1254, cp1254 | | 1.2 | 1.2 |
| Mac OS Turkish | x-mac-turkish | Based on Mac OS Roman | 1.2 | 1.2 |
| Vietnamese | | | | |
| VISCII | VISCII | RFC 1456 | N/A | N/A |
| TCVN-n | | | N/A | N/A |
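
As an illustration of how the table might be consulted in code, the following is a minimal Python sketch that looks up an Internet name and checks whether a given Text Encoding Conversion Manager version supports it. It is not part of the Text Encoding Conversion Manager API: the names `TABLE_EXCERPT`, `lookup`, and `is_supported` are hypothetical, the dictionary holds only a few rows copied from Table C-1, and the case-insensitive matching assumes the usual convention that Internet charset names are compared without regard to case.

```python
# Hypothetical excerpt of Table C-1, keyed by lowercased Internet name.
# Each value is (character encoding, first version supported by the
# Text Encoding Converter, first version supported by the Unicode Converter).
# None stands for the table's "N/A" (not supported).
_LATIN1 = ("ISO 8859-1 (Latin-1)", "1.2.1", "1.2.1")
_SJIS = ("Shift-JIS", "1.2", "1.2")

TABLE_EXCERPT = {
    "utf-8": ("Unicode 2.0 UTF-8", "1.2", "1.2.1"),
    "utf-7": ("Unicode 2.0 UTF-7", "1.2", None),
    "us-ascii": ("ASCII", "1.2.1", "1.2.1"),
    "iso-8859-1": _LATIN1,
    "latin1": _LATIN1,
    "cp850": ("CP 850 (DOS Latin-1)", "1.4", "1.4"),
    "cp866": ("CP 866 (DOS Russian)", None, None),
    "shift_jis": _SJIS,
    "x-sjis": _SJIS,
    "x-shift-jis": _SJIS,
}


def parse_version(version):
    """Turn a dotted version string such as '1.2.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def lookup(internet_name):
    """Return the table row for an Internet name, or None if it is not listed."""
    return TABLE_EXCERPT.get(internet_name.strip().lower())


def is_supported(internet_name, manager_version, converter="text"):
    """Report whether the named encoding is available in manager_version.

    converter is "text" for the Text Encoding Converter column and
    "unicode" for the Unicode Converter column.
    """
    row = lookup(internet_name)
    if row is None:
        return False
    first_version = row[1] if converter == "text" else row[2]
    if first_version is None:  # "N/A" in the table
        return False
    return parse_version(manager_version) >= parse_version(first_version)
```

For example, `is_supported("UTF-8", "1.2", "unicode")` is false under this sketch because the table gives 1.2.1 as the first Unicode Converter version for UTF-8, while `is_supported("UTF-8", "1.2")` is true for the Text Encoding Converter.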